2017-10-13

Common Errors in MySQL AB (Master-Slave) Replication

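This post records common MySQL Replication errors, both ones I ran into at work and ones other people reported from theirs, so that I and others can look them up later. Each case is identified by the key error information from Last_IO_Errno/Last_IO_Error or Last_SQL_Errno/Last_SQL_Error, so to find a case later you only need to press Ctrl+F and search for the error number or a keyword.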

Last_SQL_Errno: 1677
Last_SQL_Error: Column 1 of table 'test.t' cannot be converted from type 'int' to type 'bigint(20)'

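Solution: I found this case online and reproduced it myself. On the surface, the error looks as if the slave failed to execute a statement that converts a column type. In fact no such statement is directly involved. The cause is indirect: some earlier operation left the column types of the table different on master and slave, and the next DML against that table fails. The message means that while applying DML to table t, the slave found that the table structure differs between master and slave: here column 1 of t is int on the master but bigint on the slave. Why is this treated as an error? It is a safety measure, because mismatched column types can lead to problems such as data truncation. The fix is the slave_type_conversions parameter, which controls this behavior. It accepts the following settings: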

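ALL_LOSSY: only lossy conversions are permitted. What does lossy mean? For example, a value stored as a bigint, say 9999999999999, has to be truncated when converted to int, which leaves the data inconsistent.
ALL_NON_LOSSY: only non-lossy conversions are permitted; a conversion happens only when no information can be lost.
ALL_LOSSY,ALL_NON_LOSSY: both lossy and non-lossy conversions are permitted.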

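Empty (the parameter is not set): the column types on master and slave must be exactly the same.

Note: all of the settings above only take effect when binlog_format=ROW.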
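
As a minimal sketch of applying this on the slave: the error above is an int to bigint conversion, which widens the column, so permitting non-lossy conversions should be enough. The statements below assume a MySQL 5.6/5.7-style slave; restarting the SQL thread is a precaution I am adding, not something the original case describes.

```sql
-- Run on the slave.
-- Check the current setting (an empty value means types must match exactly).
SHOW GLOBAL VARIABLES LIKE 'slave_type_conversions';

-- Stop only the SQL (applier) thread, change the setting, then restart it.
STOP SLAVE SQL_THREAD;
SET GLOBAL slave_type_conversions = 'ALL_NON_LOSSY';
START SLAVE SQL_THREAD;

-- Confirm that Last_SQL_Errno is back to 0.
SHOW SLAVE STATUS;
```

A value set with SET GLOBAL does not survive a restart, so the same setting would also need to go into the slave's option file. The cleaner long-term fix is still to make the column definitions identical on both sides.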

Last_SQL_Errno: 1194
Last_SQL_Error: Error 'Table 'traincenter' is marked as crashed and should be repaired' on query. Default database: 'basketballman'. Query: 'update traincenter set points='4',pointstime='1361912066'  where uid = '1847482697' limit 1'

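Solution: the MyISAM table traincenter is corrupted; running REPAIR TABLE on it is enough. As for why MyISAM tables get corrupted more easily than InnoDB tables, I think there are two reasons. First, InnoDB has the doublewrite mechanism, so a corrupted or half-written page can be recovered from it. Second, InnoDB is a transactional engine and all of its operations are transactional, whereas MyISAM is non-transactional, so an operation can be interrupted after writing only half of its changes.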
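
A sketch of the repair on the slave, using the database and table names from the error message above. Running CHECK TABLE first and restarting only the SQL thread afterwards are my own additions.

```sql
-- Run on the slave, where the table is marked as crashed.
USE basketballman;

-- Optional: confirm the corruption before repairing.
CHECK TABLE traincenter;

-- Repair the MyISAM table.
REPAIR TABLE traincenter;

-- The SQL thread stopped on error 1194; restart it and check the status.
START SLAVE SQL_THREAD;
SHOW SLAVE STATUS;
```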

Last_IO_Errno: 1236
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file'

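Solution: the binlog file the slave is asking for no longer exists on the master, so the master cannot find it in its binlog index file. In my case, replication had been broken for a long time and nobody handled the alert. The outage lasted longer than the binlog expiration period on the master, so by the time we tried to resume replication the binlogs it needed had already been purged as expired. There was no option but to rebuild the instance. Let this be a lesson to everyone.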
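
To confirm this diagnosis, you can compare the file the slave wants with the files the master still has. This is a sketch; expire_logs_days is the expiration setting I assume here (MySQL 5.x), and your version may use a different variable.

```sql
-- On the slave: Master_Log_File and Read_Master_Log_Pos show
-- which master binlog the I/O thread is requesting.
SHOW SLAVE STATUS;

-- On the master: list the binlog files that still exist.
SHOW BINARY LOGS;

-- On the master: how many days binlogs are kept before being purged.
SHOW GLOBAL VARIABLES LIKE 'expire_logs_days';
```

If the file named in Master_Log_File is missing from the SHOW BINARY LOGS output, the slave cannot catch up from the binlogs and has to be rebuilt.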

Last_IO_Errno: 1593
Last_IO_Error: Fatal error: The slave I/O thread stops because master and slave have equal MySQL server ids; these ids must be different for replication to work (or the --replicate-same-server-id option must be used on slave but this does not always make sense; please check the manual before using it).

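Solution: master and slave are configured with the same server-id, and in a replication setup binlog events carrying the slave's own server-id are filtered out. For what server-id means in detail, read up on how replication works. This usually happens because a configuration file was copied and the server-id was not changed. It is easy to avoid: just be a little more careful in day-to-day operations.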
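
A sketch of checking and fixing the ids without restarting the server. The value 2 is a hypothetical example; choose any id that is unique within your replication topology.

```sql
-- On both master and slave: compare the two values.
SHOW GLOBAL VARIABLES LIKE 'server_id';

-- On the slave: assign a unique id (2 is only an example) and restart replication.
STOP SLAVE;
SET GLOBAL server_id = 2;
START SLAVE;
SHOW SLAVE STATUS;
```

The server-id in the slave's configuration file has to be corrected as well, otherwise the duplicate id comes back after the next restart.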

Last_Errno: 1053
Last_Error: Query partially completed on the master (error on master: 1053) and was aborted. There is a chance that your master is inconsistent at this point. If you are sure that your master is ok, run this query manually on the slave and then restart the slave with SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1; START SLAVE; . Query: 'insert into ...

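Solution: the query completed partially on the master and was then aborted. That immediately suggests a MyISAM table, and that is indeed what it was. Because MyISAM does not support transactions, a query can finish part of its work and then fail. The usual fix is the one the message suggests: skip one binlog event. Before skipping, though, it is best to check on the master whether the corresponding rows really exist, since the error message also shows the query that it believes was partially executed on the master.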
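
The skip suggested by the error message looks like this on the slave. It is a sketch that assumes GTID mode is off, since sql_slave_skip_counter cannot be set when the server runs with GTID mode on.

```sql
-- Run on the slave, only after checking on the master which rows the
-- aborted query actually wrote, and bringing the slave in line by hand.
STOP SLAVE;

-- Skip exactly one event from the master.
SET GLOBAL sql_slave_skip_counter = 1;

START SLAVE;

-- Confirm that both threads are running and the error is cleared.
SHOW SLAVE STATUS;
```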

Last_SQL_Errno: 1666
Last_SQL_Error: Error executing row event: 'Cannot execute statement: impossible to write to binary log since statement is in row format and BINLOG_FORMAT = STATEMENT.'

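Solution: the background here is an A -> B -> C replication chain. B and C were set to binlog_format=STATEMENT, while A used MIXED. When B replays the relay log received from A and then writes its own binlog (to pass on to C), it finds that the format of the events in the relay log does not match its own binlog_format setting. I fixed it at the time by simply changing binlog_format on B and C to MIXED.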
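
A sketch of that change, run first on B and then on C. Stopping replication around the change is my own precaution, so that the SQL thread starts again with the new global value.

```sql
-- Run on B, then repeat on C.
STOP SLAVE;

-- Accept row events from upstream by logging in MIXED format.
SET GLOBAL binlog_format = 'MIXED';

START SLAVE;

-- Verify the new global value and that the SQL thread is running again.
SHOW GLOBAL VARIABLES LIKE 'binlog_format';
SHOW SLAVE STATUS;
```

SET GLOBAL only affects sessions opened afterwards and is lost on restart, so binlog_format should also be updated in the configuration files of B and C.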

Last_Errno: 1032
Last_Error: Could not execute Update_rows event on table db.table; Can't find record in 'table', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000064, end_log_pos 158847

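Solution: this happened under binlog_format=ROW replication. Row-based replication is the strictest mode, so if MySQL cannot find the row to be updated on the slave, it treats master and slave as inconsistent and raises an error. As an aside, with row-format binlogs an update that does not actually change any row is not written to the binlog at all, because the whole point of the row format is to record only the rows that changed. How to resolve this depends on your own application; the safest approach is still to rebuild the slave, which gives more peace of mind.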

Last_Errno: 28
Last_Error: Error in Append_block event: write to '/tmp/SQL_LOAD-32343798-72213798-1.data' failed

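Solution: first, the cause. The master executed LOAD DATA INFILE. When that is replicated, the slave stores the file for LOAD DATA INFILE in /tmp by default (controlled by the slave_load_tmpdir parameter), and /tmp did not have enough space, hence the error. So it is enough to point slave_load_tmpdir on the slave at a partition with plenty of disk space.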
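
To see where the slave currently writes these temporary files, you can inspect the two related variables. This is only a sketch of the check. As far as I know, slave_load_tmpdir is not a dynamic variable in MySQL 5.x, so the new path has to be set in the slave's option file and takes effect after a restart.

```sql
-- Run on the slave.
-- Directory used for the temporary files of replicated LOAD DATA INFILE.
SHOW GLOBAL VARIABLES LIKE 'slave_load_tmpdir';

-- slave_load_tmpdir falls back to tmpdir when it is not set explicitly.
SHOW GLOBAL VARIABLES LIKE 'tmpdir';
```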